This article aims to limit the rule explosion problem affecting market basket analysis (MBA) algorithms. More specifically, it is shown how, if the minimum support threshold is not specified explicitly, but in terms of the number of items to consider, it is possible to compute an upper bound for the number of generated association rules. Moreover, if the results of previous analyses (with different thresholds) are available, this information can also be taken into account, hence refining the upper bound and also being able to compute lower bounds. The support determination technique is implemented as an extension to the Apriori algorithm but may be applied to any other MBA technique. Tests are executed on benchmarks and on a real problem provided by one of the major Italian supermarket chains, regarding more than 500, 000 transactions. Experiments show, on these benchmarks, that the rate of growth in the number of rules between tests with increasingly more permissive thresholds ranges, with the proposed method, is from 21.4 to 31.8, while it would range from 39.6 to 3994.3 if the traditional thresholding method were applied.
Loading....